link suggestion
https://gyazo.com/ae2fa8e23ef50298e6ba7a6f16a7a028
Is keyphrase extraction really what is needed?
Maybe what we need is, so to speak, "link suggestion"?
Input: a document set and one new document
Output: "links by common keyphrase" from the new document to the document set
difference
keyphrase extraction
Given a document,
short strings are obtained.
We call these short strings "keyphrases".
You may create links after the fact between documents where keyphrases are common, but you don't care about that at the keyphrase extraction stage.
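To make this concrete, here is a minimal sketch of "extract first, link after the fact". It is not the method of any particular library: the stop-word list, the function names `extract_keyphrases` and `common_keyphrase_links`, and the choice to split at stop words (a simplified RAKE-style candidate step) are all assumptions for illustration, and it only handles whitespace-separated text such as English.

```python
import re
from typing import Dict, Set, Tuple

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "of", "to", "and", "in", "for", "from", "we", "by"}


def extract_keyphrases(text: str) -> Set[str]:
    """Extract candidate keyphrases from ONE document.

    The text is split at punctuation and at stop words; every remaining
    run of words is a candidate phrase. No other document is consulted.
    """
    phrases: Set[str] = set()
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOP_WORDS:
                if current:
                    phrases.add(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.add(" ".join(current))
    return phrases


def common_keyphrase_links(
    keyphrases_by_doc: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], Set[str]]:
    """After the fact, link every pair of documents that share a keyphrase."""
    links: Dict[Tuple[str, str], Set[str]] = {}
    titles = sorted(keyphrases_by_doc)
    for i, a in enumerate(titles):
        for b in titles[i + 1:]:
            shared = keyphrases_by_doc[a] & keyphrases_by_doc[b]
            if shared:
                links[(a, b)] = shared
    return links
```

Note that `extract_keyphrases` never sees the other documents; the linking step is bolted on afterwards, which is exactly the separation described above.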
link suggestion
The main purpose is to suggest links. The set of documents to link to is given from the beginning.
It scores link candidates by their usefulness as links, rather than scoring keyphrases.
For example, a link with a very high number of occurrences is not very useful, so it scores lower.
This depends on the use case.
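One possible way to score links by usefulness is sketched below. It assumes an IDF-style penalty (the more existing documents contain a phrase, the lower the score), which is my choice for illustration and not something these notes prescribe. The function name `score_link_candidates` and the data layout (title mapped to a set of phrases) are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple


def score_link_candidates(
    new_doc_phrases: Set[str],
    existing_docs: Dict[str, Set[str]],
) -> List[Tuple[str, float, List[str]]]:
    """Return (phrase, score, target titles), best candidates first.

    Assumed scoring: log((1 + N) / (1 + df)), where N is the number of
    existing documents and df is how many of them contain the phrase.
    A phrase found in every document scores 0; rarer phrases score higher.
    """
    n_docs = len(existing_docs)
    suggestions = []
    for phrase in new_doc_phrases:
        targets = sorted(
            title for title, phrases in existing_docs.items() if phrase in phrases
        )
        if not targets:
            continue  # nothing to link to, so this is not a link candidate
        score = math.log((1 + n_docs) / (1 + len(targets)))
        suggestions.append((phrase, score, targets))
    # Highest score first; ties are broken alphabetically for stable output.
    suggestions.sort(key=lambda s: (-s[1], s[0]))
    return suggestions
```

Whether a frequent link should really be penalized depends on the use case, so the formula is the part most likely to need changing.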
RAKE joins adjacent keywords "if the pair occurs twice". That "occurrence" can be interpreted as counted over the whole document set, not just the new document. relevance
For example, in the interactive chat use case, the "logs" correspond to the "existing documents".
We don't extract keyphrases by looking only at the latest post.
Another use case: suggesting links from a newly written document to documents stored in Scrapbox.
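The idea of counting RAKE's "occurs twice" over the document set can be sketched as follows. This is a simplification, not RAKE itself: it works on plain word lists, ignores stop words (RAKE proper joins keywords together with their interior stop words), and the names `adjacent_pairs` and `joined_phrases` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def adjacent_pairs(words: Sequence[str]) -> List[Tuple[str, str]]:
    """All pairs of neighbouring words, in order."""
    return list(zip(words, words[1:]))


def joined_phrases(
    new_doc_words: Sequence[str],
    existing_docs_words: Sequence[Sequence[str]],
    min_count: int = 2,
) -> List[str]:
    """Join adjacent words of the new document into a phrase when the pair
    occurs at least `min_count` times across the new document AND the
    existing documents (e.g. chat logs), not only inside the new document.
    """
    counts = Counter(adjacent_pairs(new_doc_words))
    for words in existing_docs_words:
        counts.update(adjacent_pairs(words))

    result = []
    seen = set()
    for pair in adjacent_pairs(new_doc_words):
        if counts[pair] >= min_count and pair not in seen:
            seen.add(pair)
            result.append(" ".join(pair))
    return result
```

With the new document alone, a pair that appears once is never joined; once the logs are included in the count, the same pair can cross the threshold and become a link candidate.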
---
This page is auto-translated from /nishio/リンクサジェスト. If you find something interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.